The Resource Usage Aware Backfilling
Abstract. Job scheduling policies for HPC centers have been extensively studied in the last few years, especially backfilling-based policies. Almost all of these studies have been done using simulation tools. All existing simulators use the runtime (either estimated or real) provided in the workload as the basis of their simulations. In our previous work we analyzed the impact on system performance of considering the resource sharing (memory bandwidth) of running jobs, including a new resource model in the Alvio simulator. Based on these studies we proposed the LessConsume and LessConsume Threshold resource selection policies. Both aim to reduce the saturation of shared resources, thus increasing the performance of the system. The results showed that both resource allocation policies can improve system performance by considering where the jobs are finally allocated. Using the LessConsume Threshold Resource Selection Policy, we propose a new backfilling strategy: the Resource Usage Aware Backfilling job scheduling policy. This is a backfilling-based scheduling policy in which the algorithms that decide which job is executed and how jobs are backfilled are based on different Threshold configurations. This backfilling variant considers how the shared resources are used by the scheduled jobs. Rather than backfilling the first job that can be moved to the run queue based on job arrival time or job size, it looks ahead to the next queued jobs and tries to allocate jobs that would experience a lower penalized runtime caused by resource sharing saturation. In the paper we demonstrate how the exchange of scheduling information between the local resource manager and the scheduler can substantially improve the performance of the system when resource sharing is considered. We show that it achieves response times close to those of Shortest Job First Backfilling with First Fit (oriented to improving the start time of the allocated jobs) while providing a qualitative improvement in the number of killed jobs and in the percentage of penalized runtime.
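The look-ahead idea described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the `Job` fields, the linear penalty model in `relative_penalty`, and the `window` and `threshold` parameters are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cpus: int
    runtime: float    # estimated runtime in seconds
    bandwidth: float  # hypothetical memory-bandwidth demand (1.0 = full capacity)


def relative_penalty(job, used_bandwidth):
    """Assumed model: runtime stretches linearly once total demand exceeds 1.0."""
    return max(1.0, used_bandwidth + job.bandwidth) - 1.0


def pick_backfill(queue, free_cpus, used_bandwidth, window, threshold):
    """Look ahead over the first `window` queued jobs and return the fitting
    job with the lowest relative penalty, or None if none is acceptable."""
    best, best_penalty = None, None
    for job in queue[:window]:
        if job.cpus > free_cpus:
            continue  # does not fit in the free processors
        penalty = relative_penalty(job, used_bandwidth)
        if penalty > threshold:
            continue  # would saturate the shared resource too much
        if best is None or penalty < best_penalty:
            best, best_penalty = job, penalty
    return best


queue = [Job("a", 4, 100.0, 0.6), Job("b", 4, 100.0, 0.2), Job("c", 16, 50.0, 0.0)]
chosen = pick_backfill(queue, free_cpus=8, used_bandwidth=0.7, window=3, threshold=0.5)
print(chosen.name)
```

In this example job "a" arrives first and fits, but job "b" is chosen because it would not push the assumed bandwidth demand past capacity.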
Energy-Aware Lease Scheduling in Virtualized Data Centers
Energy efficiency has become an important measurement of scheduling
algorithms in virtualized data centers. One of the challenges of
energy-efficient scheduling algorithms, however, is the trade-off between
minimizing energy consumption and satisfying quality of service (e.g.
performance, resource availability on time for reservation requests). We
consider resource needs in the context of virtualized data centers of a private
cloud system, which provides resource leases in terms of virtual machines (VMs)
for user applications. In this paper, we propose heuristics for scheduling VMs
that address the above challenge. In our performance evaluation, simulation results
show a significant reduction in the total energy consumption of our proposed
algorithms compared with an existing First-Come-First-Serve (FCFS) scheduling
algorithm with the same fulfillment of performance requirements. We also
discuss the additional energy savings obtained when migration policies are
combined with the above-mentioned algorithms.
Comment: 10 pages, 2 figures, Proceedings of the Fifth International
Conference on High Performance Scientific Computing, March 5-9, 2012, Hanoi,
Vietnam
Improved utilization and responsiveness with gang scheduling
Most commercial multicomputers use space-slicing schemes in which each scheduling decision has an unknown impact on the future: should a job be scheduled, risking that it will block other larger jobs later, or should the processors be left idle for now in anticipation of future arrivals? This dilemma is solved by using gang scheduling, because then the impact of each decision is limited to its time slice, and future arrivals can be accommodated in other time slices. This added flexibility is shown to improve overall system utilization and responsiveness. Empirical evidence from using gang scheduling on a Cray T3D installed at Lawrence Livermore National Lab corroborates these results, and shows conclusively that gang scheduling can be very effective with current technology. 29 refs., 10 figs., 6 tabs
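The point that each decision is confined to one time slice can be sketched with a simple slot table. This is an illustrative sketch only: the `place_job` helper and the first-fit rule are assumptions, and processor contiguity and mapping are ignored.

```python
def place_job(slices, total_procs, job_id, size):
    """Put a job into the first time slice with enough free processors,
    opening a new slice if none has room. Returns the slice index.

    `slices` is a list of dicts mapping job_id -> processors used in that slice.
    """
    if size > total_procs:
        raise ValueError("job is larger than the machine")
    for index, slot in enumerate(slices):
        if sum(slot.values()) + size <= total_procs:
            slot[job_id] = size
            return index
    slices.append({job_id: size})
    return len(slices) - 1


slices = []
placements = [
    place_job(slices, 8, job_id, size)
    for job_id, size in [("A", 6), ("B", 4), ("C", 2), ("D", 4), ("E", 1)]
]
print(placements)
print(slices)
```

A job that does not fit next to the currently running ones (such as "B" here) is not blocked; it is placed in another slice, and later arrivals can still fill the gaps in earlier slices.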
Analyzing the EGEE production grid workload: application to jobs submission optimization
Grid reliability remains an order of magnitude below that of clusters on production infrastructures. This work aims at improving grid application performance by improving the job submission system. A stochastic model capturing the behavior of a complex grid workload management system is proposed. To instantiate the model, detailed statistics are extracted from dense grid activity traces. The model is exploited in a simple job resubmission strategy. It provides quantitative inputs to improve job submission performance and enables quantifying the impact of faults and outliers on grid operations.
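One way such trace statistics could feed a resubmission strategy is sketched below. The abstract does not specify the strategy, so this assumes a plain timeout-and-resubmit scheme with independent attempts: each attempt costs min(latency, timeout), and the number of attempts is geometric with success probability P(latency <= timeout). The function names and the sample data are hypothetical.

```python
def expected_latency(latencies, timeout):
    """Expected total latency when a job is cancelled and resubmitted
    every `timeout` seconds, estimated from observed latencies.

    Use float('inf') in `latencies` for jobs that never completed.
    """
    n = len(latencies)
    successes = sum(1 for x in latencies if x <= timeout)
    if successes == 0:
        return float("inf")  # no attempt ever finishes within the timeout
    mean_attempt_cost = sum(min(x, timeout) for x in latencies) / n
    return mean_attempt_cost / (successes / n)


def best_timeout(latencies, candidates):
    """Pick the candidate timeout with the lowest expected total latency."""
    return min(candidates, key=lambda t: expected_latency(latencies, t))


# Made-up sample: two jobs completed, two were lost (outliers/faults).
observed = [100.0, 200.0, float("inf"), float("inf")]
print(expected_latency(observed, 300.0))
print(best_timeout(observed, [150.0, 300.0]))
```

With these made-up numbers, a 300-second timeout gives a lower expected latency than a 150-second one, because the shorter timeout cancels too many jobs that would have completed.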
The Influence of the Structure and Sizes of Jobs on the Performance of Co-Allocation
Over the last decade, much research in the area of scheduling has concentrated on single-cluster systems. Less attention has been paid to multicluster systems, although they are gaining more and more importance in practice. We propose a model for scheduling rigid jobs consisting of multiple components in multicluster systems by pure space sharing, based on the Distributed ASCI Supercomputer. Using simulations, we assess the influence of the structure and sizes of the jobs on the system’s performance, measured in terms of the average response time and the maximum utilization. We consider three types of requests, total requests, unordered requests and ordered requests, and compare their effect on the system’s performance for two scheduling policies, First Come First Served, and Fit Processors First Served, which allows the scheduler to look further in the queue for jobs that fit. These types of job requests are differentiated by the restrictions they impose on the scheduler and by the form of co-allocation used. The results show that the performance improves with decreasing average job size and when fewer restrictions are imposed on the scheduler.
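The difference between the two policies can be shown with a small sketch. It assumes ordered requests only, represented as a tuple with one component size per cluster; the helper names are hypothetical, and the sketch makes a single scheduling pass without modeling job completion or any safeguard against starvation.

```python
def fits(request, free):
    """An ordered request fits if every component fits on its own cluster."""
    return all(need <= avail for need, avail in zip(request, free))


def allocate(request, free):
    """Return the free-processor counts after starting the request."""
    return [avail - need for need, avail in zip(request, free)]


def fcfs(queue, free):
    """First Come First Served: stop at the first job that does not fit."""
    started = []
    for name, request in queue:
        if not fits(request, free):
            break
        free = allocate(request, free)
        started.append(name)
    return started, free


def fpfs(queue, free):
    """Fit Processors First Served: keep scanning the queue for jobs that fit."""
    started = []
    for name, request in queue:
        if fits(request, free):
            free = allocate(request, free)
            started.append(name)
    return started, free


queue = [("a", (6, 2)), ("b", (4, 4)), ("c", (2, 6))]
print(fcfs(queue, [8, 8]))
print(fpfs(queue, [8, 8]))
```

Here FCFS stops at job "b", leaving processors idle, while FPFS skips it and starts job "c". Handling unordered or total requests would only change the `fits` and `allocate` helpers.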
Scaling of workload traces
Abstract — The design and evaluation of job scheduling strategies often require simulations with workload data or models. Usually workload traces are the most realistic data source, as they include all explicit and implicit job patterns, which are not always considered in a model. In this paper, a method is presented to enlarge and/or duplicate jobs in a given workload. This allows the scaling of workloads for later use on parallel machine configurations with a different number of processors. As quality criteria, the scheduling results of common algorithms were examined. The results show high sensitivity of schedule attributes to modifications of the workload. To this end, different strategies of scaling the number of job copies and/or the job size were examined. The best results were achieved by adjusting the scaling factors to be higher than the precise ratio between the new scaled machine size and the original source configuration.
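The two scaling operations named in this abstract, enlarging jobs and duplicating them, can be sketched as follows. The `Job` record, the rounding rule, and the cap at the new machine size are assumptions for illustration; the paper's actual method and factor values are not reproduced here.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Job:
    submit: float   # submission time in seconds
    procs: int      # requested processors
    runtime: float  # runtime in seconds


def scale_workload(jobs, new_size, size_factor, copies):
    """Enlarge every job by `size_factor` (rounded up, capped at the new
    machine size) and emit `copies` duplicates of each enlarged job."""
    scaled = []
    for job in jobs:
        procs = min(new_size, max(1, math.ceil(job.procs * size_factor)))
        scaled.extend(replace(job, procs=procs) for _ in range(copies))
    return scaled


trace = [Job(0.0, 4, 100.0), Job(10.0, 128, 50.0)]
scaled = scale_workload(trace, new_size=256, size_factor=2.0, copies=2)
print([job.procs for job in scaled])
```

The size factor and the number of copies are independent parameters here, so either strategy can be applied alone or both can be combined.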